Discrete Marks

Dependencies

source("utils.R")

Setup

spe <- readRDS("../data/spe.rds")

#subset the data to only look at sample ID 0.01
sub <- spe[, spe$sample_id == 0.01]
(pp <- .ppp(sub, marks = "cluster_id"))
Marked planar point pattern: 6111 points
marks are of storage type  'character'
window: rectangle = [1222.5635, 3012.4248] x [-3993.535, -2202.755] units
#split the multitype point process into several single type processes
#first, set the marks of the point process to be factors
marks(pp) <- factor(marks(pp))
ppls <- split(pp)

If not otherwise indicated, all information was taken from Baddeley et al. - Spatial Point Patterns.

#Plot the entire point process where the marks are overlayed
plot(unmark(pp), main = 'Point Pattern Unmarked')

#Plot the marks separately 
plot(ppls, main = 'Point Pattern Marks Separated')

Concepts and Definitions of Point Processes

Point Process

Complete Spatial Randomness

Complete spatial randomness (CSR) is the null model of point patterns, being the result of a poisson process. A completely random process is characterised by two properties

Homogeneity

Homogeneity means that the expected number of points falling into a given region \(B\) is proportional to its area \(|B|\) given a proportionality constant \(\lambda\). The constant \(\lambda\) is the intensity of the process, so the average number of points in a unit area.

\[ \mathbb{E}[X\cap B] = \lambda |B| \label{eq:expected_number_points} \]

Independence

Independence means that in two regions \(A\) and \(B\) the number of points \(n(X\cap A)\) and \(n(X\cap B)\) are two independent random variables. That means the number of points in region \(A\) does not affect the number of points in region \(B\). The number of points follow a poisson distribution:

\[ \mathbb{P}[N=k] = e^{-\mu}\frac{\mu^k}{k!}\\ \label{eq:poisson_process} \]

Inhomogeneous Poisson Process

A poisson process that is spatially varying in its average density of points is called inhomogeneous. Here, the average density \(\lambda(u)\) is a function of spatial location \(u\). The expected number of points falling into a region \(B\) is

\[ \mu = \int_{B} \lambda(u)du \label{eq:expected_number_inhomogeneous} \]

Isotropy

A point process is called isotropic, if its statistical properties are invariant to rotations. A CSR process is both stationary and isotropic.

Stationarity

“A point process is called stationary if, when we view the process through a window \(W\), its statistical properties do not depend on the location of the window in two-dimensional space.”

Correlation stationarity

It is important to note that the inhomogeneous metrics do not apply to every spatially inhomogeneous point process. They only apply to point processes which are correlation stationary. A point pattern is correlation stationary if the metric only depends on the relative position in subpatterns of the point process, which means that estimates of the inhomogeneous metric should be similar in different subquadrats of the point pattern.

Local scaling

The inhomogeneous K-function further assumes that while the intensity is spatially varying, the scale of the interaction remains constant. This is equivalent to the assumption that in small subregions the process is stationary and isotropic, but the rescaling factor can vary across the total process. In this case the locally scaled version of the K-function is applicable.

We can use a permutation test to test the inhomogeneity assumption. In this scenario, we split the patterns into quadrats and compare the estimatied functions between the quadrats. It has to be noted that this test highly depends on the arbitrarydefinition of the quadrats.

selection <- c('OD Mature')
pp_sel <-  subset(pp, marks %in% selection, drop = TRUE)


odrho <- rhohat(unmark(pp_sel), "x", method="tr")
odlambda <- predict(odrho)

od4 <- quantess(unmark(pp_sel), "x", 2)
od42 <- nestsplit(pp_sel, od4, ny=3)

plot(od42)

od42$inten <- factor(as.integer(od42$f1) <= 1, labels=c("Hi","Lo"))

res.scaled <- studpermu.test(od42, pts ~ inten, summaryfunction=Kscaled,
               minpoints = 10)

res.inhom <- studpermu.test(od42, pts~ inten, summaryfunction=Kinhom,
               lambda=odlambda, minpoints = 10)

#p-value of the local-scaling test
res.scaled$p.value
[1] 0.001
#p-value of the inhomogeneity test
res.inhom$p.value
[1] 0.368

Alternatively, we can inspect deviations against the hypothesis that the points were generated by a Poisson process. We can identify hotspots and coldspots by comparing the standard error of the relrisk function, which computes nonparamatric estimates of the relative risk by kernel smoothing, to the theoretical null distribution of points. The relative risk is the ratio of spatially varying probablilities of different types. Source

#TODO: We should maybe move this to after the intensity part

# select marks
selection <- c('OD Mature', 'Ependymal', 'Microglia')
pp_sel <-  subset(pp, marks %in% selection, drop = TRUE)

f1 <- pValuesHotspotMarks(pp_sel)

# Plot significant p-values
plot(f1$p, main = "Significant difference\n to Poisson process alpha = 0.05")

f0 <- pValuesHotspot(pp_sel)

# Plot significant p-values
plot(f0$p, main = "Significant difference\n to Poisson process alpha = 0.05")

Intensity

Intensity is the expected density of points per unit area, as seen above. It can be interpreted as the rate of occurrence or the abundance of events recorded. The intensity itself is called a first moment property - being related to the expected number of points.

Estimating Intensity

The intensity can be estimated regardless of the type of the point pattern. In order to do so, we sum the individual intensities of the marks

intensityPointProcess<- function(pp,mark){
  if(mark == TRUE){
    return(intensity(pp))
  }
  else{
    return(sum(intensity(pp)))
  }
}

intensityPointProcess(pp,mark=FALSE)
[1] 0.001906561

else we can look at each mark individually

intensityPointProcess(pp,mark=TRUE)
   Ambiguous    Astrocyte  Endothelial    Ependymal   Excitatory   Inhibitory 
2.411670e-04 2.015445e-04 1.463225e-04 8.361288e-05 3.681463e-04 6.130571e-04 
   Microglia  OD Immature    OD Mature    Pericytes 
3.026287e-05 6.239767e-05 1.425787e-04 1.747135e-05 

Quadrat Counting

In quadrat counting all the points falling into a given quadrat are counted. This gives an overview on the characterstics of the point pattern, such as correlation stationarity.

Q5 <- quadratcount(pp, nx=5, ny=5)
plot(unmark(pp), main='Unmarked Point Pattern Quadrats')
plot(Q5, col='black', add=TRUE)

The quadrat counts can be tested against regularity. This can happen again in the unmarked pattern or in the separated types. This tells us if the counts of the points are distributed evenly across the quadrats.

quadratTestPointProcess <- function(pp, mark){
  if(mark==TRUE){
    return(lapply(split(pp), quadrat.test, 5, alternative="regular", method="MonteCarlo"))
  }
  else{
    return(quadrat.test(unmark(pp), 5, alternative="regular", method="MonteCarlo"))
  }
}
quadratTestPointProcess(pp, mark=FALSE)

    Conditional Monte Carlo test of CSR using quadrat counts
    Test statistic: Pearson X2 statistic

data:  unmark(pp)
X2 = 245.95, p-value = 1
alternative hypothesis: regular

Quadrats: 5 by 5 grid of tiles

Kernel Estimation

In kernel estimation we try to estimate the intensity function \(\lambda(u)\) of the point process. There are different types of kernel estimators (see Baddeley). A popular choice is the isotropic Gaussian kernel where the standard deviation corresponds to the smoothing bandwidth.

Dens <- density(pp)
plot(Dens, main = 'Kernel Density')

Testing for CSR

Whether or not a point process is completely spatially random depends on two characeteristics. The points have to be distributed homogeneously and they have to be independent of each other (see definitions above). There are different ways to test for CSR which are summarised in the wrapper below.

#PRE: takes a point process and the indication, which test for CSR should be performed and potentially a covariate
#POST: returns if a point process or its individual point process marks are CSR or not.
#TODO: Change maybe to a switch statment in terms of computing time; Change lapply to mclapply later on
#TODO: There is a conceptual mistake - we pull out marks and test them against a markov simulation. It should rather be to test against all the other cells
testingCSR <- function(
    pp,
    method = c('quadrat','cdf','bermans','clark-evans','hopkins-skellam'),
    mark = FALSE,
    covariate = NULL,
    test = c('ks', 'cvm', 'ad'),
    verbose = FALSE
){
  #perform a quadrattest for individual marks or for the entire pointprocess
  if(method == 'quadrat'){
    if(mark == TRUE){
      test.result <- lapply(split(pp), quadrat.test, 5, method="MonteCarlo")
    }
    else{
      test.result <- quadrat.test(unmark(pp), 5, method="MonteCarlo")
    }
   }
  #perform a cdftest for individual marks or for the entire pointprocess given a covariate
  else if(method == 'cdf' && !is.null(covariate)){
    if(mark == TRUE){
      test.result <- lapply(split(pp), cdf.test, covariate, test=test)
    }
    else{
      test.result <- cdf.test(unmark(pp), covariate, test=test)
    }
  }
  #perform a bermans test for individual marks or for the entire pointprocess
  else if(method == 'bermans' && !is.null(covariate)){
     if(mark == TRUE){
      test.result <- lapply(split(pp), berman.test, covariate, test='Z1')
    }
    else{
      test.result <- berman.test(unmark(pp), covariate, test='Z1')
    } 
  }
  #perform a clark evans test for individual marks or for the entire pointprocess
  else if(method == 'clark-evans'){
     if(mark == TRUE){
      test.result <- lapply(split(pp), clarkevans.test)
    }
    else{
      test.result <- clarkevans.test(unmark(pp))
    } 
  }
  #perform a hopkins-skellam test for individual marks or for the entire pointprocess
  else if(method == 'hopkins-skellam'){
     if(mark == TRUE){
      test.result <- lapply(split(pp), hopskel.test)
    }
    else{
      test.result <- hopskel.test(unmark(pp))
    } 
  }
  #base case of the "switch" statement
  else{
    print("ERROR: non-specified arguments or methods")
    return(NULL)
  }
  
  #summarise the results as a mask of booleans to indicate which structures are 
  #random and which are not
  if(mark==TRUE){
    p.value.mask <- lapply(test.result, function(x) x$p.value>0.05)
  }
  else{
    p.value.mask <- test.result$p.value>0.05
  }
  #return the values of the test calculations, either just the boolean if 
  #CSR or not or the entire test statistics
  if(verbose == TRUE){
    return(test.result)
  }
  else{
    return(p.value.mask)
  }
}
result <- testingCSR(pp,method='clark-evans',mark=TRUE, verbose=FALSE)

testingCSR(ppls$Ependymal, method='clark-evans')
[1] FALSE
testingCSR(ppls$`OD Mature`, method='clark-evans')
[1] FALSE
testingCSR(ppls$Microglia, method='clark-evans')
[1] FALSE

Correlation

Correlation or more generally covariance is called a second order quantity and measures dependence between data points. This is a very useful concept, allowing for the assessment of correlation between points.

Ripley’s \(K\)

Empirical Ripley’s \(K\)

In the framework of correlation analysis we often look at distances \(d_{ij} = ||x_i-x_j||\) of all ordered points. It is a natural idea to look at the summary of these distances \(d_{ij}\), e.g. a histogram. The histogram of this point process is a difficult statistic, as it depends on the observation window \(W\), thus the histogram can change significantly with a changing window \(W\). Therefore, we look at the empirical distribution function of the distances \(d_{ij}\) that are smaller or equal than a radius \(r\)

\[ \hat{H}(r) = \frac{1}{n(n-1)}\sum_{i=1}^n \sum_{j=1\\j\neq i}^n \{d_{ij}\leq r\} \]

The contribution of each point \(x_i\) to the sum above is

\[ t_i(r) = \sum_{j \neq i} \mathbb{1} \{d_{ij}\leq r\} \]

this number \(t_i(r)\) is the number of points that fall within a radius \(r\) centered at \(x_i\). It follows then:

\[ \hat{H}(r) = \frac{1}{n(n-1)}\sum_{i=1}^n t_i(r) = \frac{1}{n-1} \bar{t}(r) \]

Here, we see what we actually want to measure is “the average number of r-neighbours of a typical random point”. This number is still dependent on the size of the observation window so we want to standardise is by the number of points and \(|W|\) the window size. Then we obtain the empirical Ripley’s \(K\) function

\[ \hat{K}(r) = \frac{|W|}{n(n-1)}\sum_{i=1}^n\sum_{j=1 \\j \neq i}^n\{d_{ij}\leq r\} \]

The standardisation makes it possible to compare point patterns with different observation windows and with different numbers of points. Using the empirical \(K\) function assumes though tha the point process has homogeneous intensity.

True \(K\)-function

instead of the summary of pairwise distances, we are interested in the point process. In order to do so, we have to think about the expected number of \(r\)-neighbours given a point \(X\) at a location \(u\) divided by its intensity \(\lambda\)

This means

\[ K(r) = \frac{1}{\lambda} \mathbb{E} [t(u,r,X)|u \in X] \]

where

\[ t(u,r,X) = \sum_{j=1}^{n(X)} \mathbb{1} \{0<||u-x_j||\leq r\} \]

This definition of the true \(K\) function is only valid if the point process is stationary. For a homogeneous poisson process we obtain

\[ K_{pois}(r) = \pi r^2 \]

Edge effects and their corrections for spatial metrics

Edge effects describe the phenomenon that we never observe the entire point process but only a part of it within a window \(W\). This means that parts of the point process at the border might not be observed and the value of the statistic biased along the edges.

There are many corrections for edge effects. They are briefly listed here

Border correction

In border correction the summation of data points is restricted to \(x_i\) for which \(b(x_i,r)\) is completely in the window \(W\).

Isotropic correction

We can regard edge effect as a sampling bias. Larger distances (e.g. close to the edges) are less likely to be observed. This can be corrected for.

Translation correction

A stationary point process \(X\) is invariant to translations. So the entire point process can be shifted by a vector \(s\) to be at the position \(X+s\).

The \(L\)-function

The \(K\)-function can be centered which is then called the \(L\)-function. The \(L\)-function is a variance-stabilising version of the \(K\)-function (see spicyR for reference).

\[ L(r) = \sqrt{\frac{K(r)}{\pi}} \]

Pair Correlation function

We have seen above, that the \(K\)-function is cumulative in nature. Meaning that the contributions of all distances smaller equal to \(r\) are counted. An alternative is to take the derivative of the \(K\)-function in order to obtain contributions of distances between points equal to \(r\).

\[ g(r) = \frac{K'(r)}{2\pi r} \]

\(g(r)\) is the probability of observing a pair of point of the process separated by a distance \(r\), divided by the corresponding probability for a Poisson process.”

Estimator of the pair correlation function

The pair correlation function can be estimated via kernel smoothing. In very large datasets the pair correlation function can be approximated using histogram-based methods.

\[ \hat{g}(r) = \frac{|W|}{2 \pi r n (n-1)} \sum_{i=1}^n\sum_{j=1 \\j \neq i}^n \kappa_h(r-d_{ij})e_{ij}(r) \]

where \(\kappa\) is the smoothing kernel. \(\kappa_h(x)\) is a rescaled version of the template kernel \(\kappa\)

\[ \kappa_h(x) = \frac{1}{h}\kappa\left(\frac{x}{h}\right) \]

In the above, \(\kappa\) can be any probability density “over the real line with mean 0”. Usually, the Epanechinikov kernel is used as smoothing kernel with half-width \(w\).

#PRE: list of point pattern, corresponding celltypes of interest, functions to evaluate
#POST: result of the metric
metricResBoot <- function(ppls, celltype, fun){
  metric.res <- lohboot(ppls[[celltype]], fun = fun)
  metric.res$type <- celltype
  return(metric.res)
}
#PRE: celltypes, function to calculation and edge correction method
#POST: dataframe of 
metricResBootToDF <- function(celltype_ls, ppls, fun, edgecorr){
  res_ls <- lapply(celltype_ls, metricResBoot, fun = fun, ppls = ppls)
  #stick all values into a dataframe
  res_df <- c()
  for(i in 1:length(celltype_ls))res_df <- rbind(res_df, res_ls[[i]])
  return(res_df)
}

#PRE: Celltypes of interest, function to analyse, edge correction to perform
#POST: plot of the metric
plotMetric <- function(celltype_ls, ppls, fun, edgecorr){
  res_df <- metricResBootToDF(celltype_ls, ppls, fun, edgecorr)
  #plot the curve
  p <- ggplot(res_df, aes(x=r, y=res_df[[edgecorr]], col= type))+
    geom_line()+
    geom_ribbon(aes(ymin = lo, ymax = hi), alpha = 0.25)+
    ggtitle(paste0(fun, '-function'))+
    geom_line(aes(x=r,y=theo, color = 'Poisson'),linetype = "dashed")+
    ylab(edgecorr) +
    scale_color_manual(name='Point Processes',
                     breaks=c('Ependymal', 'Microglia', 'OD Mature', 'Poisson'),
                     values=c('Ependymal'='red', 'Microglia'='dark green', 'OD Mature'='blue', 'Poisson'='black'))+
    theme_light()
  return(p)
}

celltype_ls <- c("Ependymal", "OD Mature", "Microglia")
p_K <- plotMetric(celltype_ls, ppls, 'Kest', 'iso')

p_L <- plotMetric(celltype_ls, ppls, 'Lest', 'iso')

p_g <- plotMetric(celltype_ls, ppls, 'pcf', 'border')
p_K/p_L/p_g

Correcting for Inhomogeneity

Inhomogeneous \(K\)-function

In the case that a spatial pattern is known or suspected to be inhomogeneous, we have to take this into account for our analysis. Biological samples display inhomogeneity very often, therefore this analysis is preferred over the homogeneous alternatives. Inhomogeneous alternatives can be calculated via:

\[ K_{inhom}(r) = \mathbb{E} \left[\sum_{x_j \in X} \frac{1}{\lambda(x_j)}\mathbb{1}\{0<||u-x_j||\leq r\}|u \in X\right] \]

This theoretical quantity can be approximated with estimators such as

\[ \hat{K}_{inhom}(r) = \frac{1}{D^p|W|}\sum_i\sum_{j \neq i} \frac{\mathbb{1}\{||u-x_j||\leq r\}}{\hat{\lambda}(x_j)\hat{\lambda}(x_i)}e(x_j,x_i;r) \]

where \(e(u,v;r)\) is an edge correction weight, \(\hat{\lambda}(u)\) is an estimator of the intensity of \(u\) and \(D^p\) is the pth power of

\[ D = \frac{1}{|W|}\sum_i \frac{1}{\hat{\lambda}(x_i)} \]

p_K <- plotMetric(celltype_ls, ppls, 'Kinhom', 'iso')

p_L <- plotMetric(celltype_ls, ppls, 'Linhom', 'iso')

p_g <- plotMetric(celltype_ls, ppls, 'pcfinhom', 'border')
p_K/p_L/p_g

The inhomogeneous \(K\)-function tells us that the microglia cells follow a Poisson process (dashed line) closely and can therefore be assumed to be randomly distributed and not clustered. Ependymal cells show a high degree of clustering at a low radius \(r\). OD mature cells are in the middle, showing a lower degree of clustering at lower values of \(r\).

The \(L\)-function is a variance stabilised (source spicyR) version of the \(K\)-function. Thus, the information is complementary to the above. Microglia cells are along the dashed poisson line, indicating no clustering of microglia cells. Ependymal cells are highly clustered at low values of \(r\), whereas OD mature show intermediate clustering at lower values of \(r\).

The pair correlation function is the derivative of the \(K\)-function. Therefore, it is not a sum of the points in the circle with radius \(r\) but rather the individual points on the radius \(r + h\) where \(h\) is very small. The pcf plot gives similar information as before: Microglia cells are around the dashed poisson line. OD Mature cells show a rather broad range of correlations between \(r \in [20,100]\). Ependymal cells have a very strong correlation at \(\sim r = 25\).

We have to note that inhomogeneity correction assumes that the process is correlation stationary, meaning that the summary statistics are the same in each quadrat. This is clearly violated at least for Ependymal cells and OD mature cells. Therefore, the question remains whether acounting for one issue (homogeneity) via a correction that assumes correlation stationarity does not just exchange one problem for another.

Local Scaling

Locally-scaled \(K\)-function

In the inhomogeneous \(K\) approach above, we assume that the local scale of the point process is not changed. However, the intensity can vary spatially. In a biological sample, this assumption is easily violated, e.g. when a gradient of cells that increases from one side to another. Therefore, we can assume that the process is subdivided in small regions. In these small regions the point process is a scaled version of a template process. This template process has to be both stationary and isotropic. For two locations \(u\) and \(v\) we would then assume that

\[ g(u,v) = g_1 \left(\frac{||u-v||}{s}\right) \]

In this example, \(g_1\) is the pair correlation function of the template process and \(s\) a scaling factor.

Locally-scaled pair-correlation function

would work by taking the derivative of the locally scaled \(K\)-function

Locally-scaled \(L\)-function

As the \(L\) is just a transformation of the \(K\)-function, the same local scaling can apply to the \(L\)-function

### need to redefine the metric function, because bootstrap is not available for locally scaled functions ###
#PRE: list of point pattern, corresponding celltypes of interest, functions to evaluate
#POST: result of the metric
metricRes <- function(ppls, celltype, fun){
  metric.res <- do.call(fun, args = list(X=ppls[[celltype]]))
  metric.res$type <- celltype
  return(metric.res)
}

#PRE: celltypes, function to calculation and edge correction method
#POST: dataframe of 
metricResToDF <- function(celltype_ls, ppls, fun, edgecorr){
  res_ls <- lapply(celltype_ls, metricRes, fun = fun, ppls = ppls)
  #stick all values into a dataframe
  res_df <- c()
  for(i in 1:length(celltype_ls))res_df <- rbind(res_df, res_ls[[i]])
  return(res_df)
}
### need to redefine the plotting function, because bootstrap is not available for locally scaled functions ###
#PRE: Celltypes of interest, function to analyse, edge correction to perform
#POST: plot of the metric
plotScaledMetric <- function(celltype_ls, ppls, fun, edgecorr){
  res_df <- metricResToDF(celltype_ls, ppls, fun, edgecorr)
  #plot the curve
  p <- ggplot(res_df, aes(x=r, y=res_df[[edgecorr]], col= type))+
    geom_line(linewidth=1)+
    ggtitle(paste0(fun, '-function'))+
    geom_line(aes(x=r,y=theo, color = 'Poisson'),linetype = "dashed", linewidth=1)+
    ylab(edgecorr) +
    scale_color_manual(name='Point Processes',
                     breaks=c('Ependymal', 'Microglia', 'OD Mature', 'Poisson'),
                     values=c('Ependymal'='red', 'Microglia'='dark green', 'OD Mature'='blue', 'Poisson'='black'))+
    theme_light()
  return(list(p = p, res_df = res_df))
}

p_K <- plotScaledMetric(celltype_ls, ppls, 'Kscaled', 'iso')$p
p_L <- plotScaledMetric(celltype_ls, ppls, 'Lscaled', 'iso')$p
p_K/p_L

The interpretation of the locally scaled \(L\)-function is similar to the interpreation of the inhomogeneous \(L\)-function. The correlation is strongest for Ependymal cells, followed by OD mature cells. Microglia cells are again close to the random poisson process. Note that here, the curves of the Ependymal and OD mature cells stay always above the dashed poisson line, unlike in the inhomogeneous version.

Given that our biological samples are both inhomogeneous and locally scaled by eye (can be tested as seen above), the locally scaled \(L\)-function seems a good variant for assessing correlation.

Local Indicators of Spatial Association

The \(K\) and \(L\)-functions above are summary statistics over the entire pattern. However, if we know that there are different regions in our point pattern, we might want to know the individual contributions of these patterns. This gives then e.g. \(n\) (for all points) local \(K\),\(L\) or pair correlation-functions. Baddeley et. al. propose to compare these \(n\) functions with e.g. functional principal component analysis. We will show here the example of the LISA version of the \(L\) function.

Local \(L\) function
L_odmature_lisa <- localL(ppls$`OD Mature`)
# plot(L_odmature_lisa, main = 'local L functions OD Mature', legend=FALSE)

df <- as.data.frame(L_odmature_lisa)

dfm <- reshape2::melt(df, "r")

get_sel <- dfm %>% filter(r > 200.1358 & r < 200.1360, variable != "theo") %>%
  mutate(sel = value) %>% select(variable, sel)

dfm <- dfm %>% left_join(get_sel)

p <- ggplot(dfm, aes(x=r, y=value, group=variable, colour=sel)) +
  geom_line() + 
  scale_color_continuous(type = "viridis") +
  geom_vline(xintercept = 200) +
  theme(legend.position = "none") +
  theme_light()

ppdf <- as.data.frame(pp) %>% filter(marks=="OD Mature")
ppdf$sel <- get_sel$sel # assume they are in same order

q <- ggplot(ppdf, aes(x=x, y=y, colour=sel)) + 
  geom_point() +
  scale_color_continuous(type = "viridis") +
  theme(legend.position = "none") +
  theme_light()
p|q

In the case of the OD mature cells we obtain further information with this plot. We note that there are two distinct populations of curves, those that are clearly above the straight poisson line and others that are around/underneath the straight poisson line. This indicates that there are two different kinds of interactions in the OD mature cells. Stronger clustering (the upper part of the plot) and more random parts (lower part).

There are inhomogeneous versions of these (e.g. localLinhom). These are not shown here for brevity.

Functional PCA for the \(n\) Curves

We apply functional PCA to retrieve the main trends in these individual curves. The idea of functional PCA is the same as for ordinary PCA just applied to the concept of functions. For the \(n\) functions above functional PCA will recover the main trends in the data

#adapted from the fdapace vignette
functional.pca.pp <- function(df){
  #df <- res_df_wider
  df_fdob <- df %>% as.matrix()
  #remove theo column - we want only the actual estimations in there without the poisson line theo
  if('r' %in% colnames(df) || 'theo' %in% colnames(df)){
     df_fdob <- df_fdob[,!colnames(df_fdob) %in% c("r", "theo")]
  }
  if('Ependymal' %in% colnames(df)){
    df_fdob <- df_fdob[,!colnames(df_fdob) %in% c("trans",'iso')]
  }
  #number of columns
  N <- ncol(df_fdob)
  #number of rows
  M <- nrow(df_fdob)
  #the x values at which all the curves were evaluated, here called tVec
  s <- df$r
  #create the FPCA object
  fd_obj <- fdapace::MakeFPCAInputs(IDs = rep(1:N, each=M),tVec=rep(s,N), yVec=df_fdob)
  print(which( unlist( lapply(fd_obj$Lt, function(x) length(x) != length(unique(x))))))
  #check that the FPCA object is valid
  fdapace::CheckData(fd_obj$Ly, fd_obj$Lt)
  #run the computation of the FPCA - would work with sparse data.
  fpca_obj <- fdapace::FPCA(fd_obj$Ly, fd_obj$Lt, list(plot = TRUE, dataType='Dense', kernel='rect'))
  fdapace::CreatePathPlot(fpca_obj,K = 3, pch = 4,showObs = FALSE, showMean = TRUE)
  return(fpca_obj)
}

fpca_obj <- functional.pca.pp(L_odmature_lisa)

fpca_pc_df <- as.data.frame(fpca_obj$xiEst) %>% rename(PC1 = V1, PC2 = V2, PC3 = V3)
fpca_pc_df$sel <- get_sel$sel # assume they are in same order

ggplot(fpca_pc_df, aes(x=PC1, y=PC2, col = sel)) + 
  scale_color_continuous(type = "viridis") +
  geom_point() +
  theme_light() +
  ggtitle("Biplot of the LISA L curves of the OD mature cells")

Here, we see the functional PCA for the OD mature cells. The Design plot tells us that we have a very dense dataset over the entire support. The mean curve displays the mean trend over all \(n\) LISA \(L\)-curves (note that this result is similar to the locally scaled \(L\)-function). The scree plot indicates that the first eigenfunction explains more than \(80 \%\) of the variance. The eigenfunction curves in the bottom right panel indicate the deviation from the mean curve.

Looking at the second plot we see the smoothed mean curve and the individual curves that are reconstructed from the first three eigenfunctions. The first eigenfunction from the bottom right panel, \(\phi_1\), is above the mean curve. When we look at the example that is clearly visible. \(\phi_2\) is first above the mean curve and then lower than the mean curve. These curves are visible as well. Lastly, \(\phi_3\) is curves that start low and pick up to be larger than the mean curve in the end. This is visible in e.g. the orange dashed line -> check the source from the Springer book “Functional Data Analysis with R and Matlab”.

The last plot shows the biplot of the functional PCA. Each point is a cell from the OD mature cells. The first two principal components are plotted. The first principal component is the x-axis and the second principal component is the y-axis. The points are coloured as in the plots of the LISA \(L\)-curves. We note that the first principal component separates the two populations. The curves with the high values (yellow) are plotted on the right and the ones with low values (blue) on the left. This corresponds to the two populations we saw in the LISA \(L\)-curve plot.

Third-Order Summary Statistics

So far we have considered first- and second-order summary statistics and local adaptations of these. In the following, we will continue a high-order statistics. In second-order statistics one considers pairs and counts these in the case of the \(K\) function. In a third-order setting we would now count triplets of points. A triplet is counted as the normalised expected value of triangles where all edges are smaller than the radius r

\[ T(r) = \frac{1}{\lambda^3}\mathbb{E}\left[\sum_{i=1}^n\sum_{j=1\\j\neq i}^nm(x_i,x_j,u) | u \in X\right] \]

here m is the maximum side of the triangle

\[ m(a,b,c) = \max(||a-b||,||a-c||,||b-c||) \]

p <- plotScaledMetric(celltype_ls, ppls, 'Tstat', 'trans')
p
$p


$res_df
Function value object (class 'fv')
for the function r -> T(r)
......................................................................
           Math.label       Description                               
r          r                distance argument r                       
theo       T[pois](r)       theoretical Poisson T(r)                  
bord.modif hat(T)[bordm](r) modified border-corrected estimate of T(r)
border     hat(T)[bord](r)  border-corrected estimate of T(r)         
trans      hat(T)[trans](r) translation-corrected estimate of T(r)    
type       T[type](r)       Additional variable 'type'                
......................................................................
Default plot formula:  .~r
where "." stands for 'type', 'trans', 'border', 'bord.modif', 'theo'
Recommended range of argument r: [0, 447.47]
Available range of argument r: [0, 447.47]

Spacing

So far, most approaches considered intensity and correlation as measures to assess a point pattern. Below, we will look at measures of spacing and shortest-distances to assess spatial arrangements.

Baddeley et.al. summarises three basic distances to measure

  • pairwise distance: \(d_{i,j} = ||x_i-x_j||\)
  • nearest-neighbour distances: \(d_i = \min_{j \neq i}d_{ij}\)
  • empty-space distance: \(d(u) = \min_j||u-x_j||\)

There are test of CSR that are based on spacing -> Clark-Evans test and Hopkins-Skellam Index that are both implemented above in the CSR testing function

Nearest Neighbour approaches

Nearest neighbour methods center around the notion of “nearness”. In particular, we introduce nndist from spatstat, a method to calculate the distances until \(k\) nearest neighbours are found. This function returns us then \(k\) curves for the \(k\) neighbour distances. We can for instance collapse this information of the \(k\) curves in a mean curve per sample. This information of the mean nearest neighbour distance is summarised as a density.

nndistance <- function(pp, nk){
  xy <- cbind(pp$x, pp$y)
  nndistances_k15 <- nndist(xy, k = nk) 
  nndistances_mean <- rowMeans(nndistances_k15)
  return(nndistances_mean)
}

#PRE: list of point pattern, corresponding celltypes of interest, functions to evaluate
#POST: result of the metric
metricRes_nndist <- function(ppls, celltype, fun){
  metric.res <- list(res = do.call(fun, args = list(pp=ppls[[celltype]], nk = seq(1:15))))
  metric.res$type = celltype
  return(metric.res)
}

#go through all defined celltypes and calculate the nearest-neighbour distance
res_ls <- lapply(celltype_ls, metricRes_nndist, fun = nndistance, ppls = ppls)
#initialise a dataframe for the metric values and the type information
res_df <- data.frame(metric = numeric(0), type = character(0))
# Loop through the res_ls list and combine the metric values with their corresponding type - ChatGPT
for (i in 1:length(res_ls)) {
  metric_values <- res_ls[[i]]$res
  metric_type <- rep(res_ls[[i]]$type, length(metric_values))
  df <- data.frame(metric = metric_values, type = metric_type)
  res_df <- rbind(res_df, df)
}
#plot the densities
p <- ggplot(res_df, aes(x=metric, col= type))+
    geom_density(linewidth=1)+
    scale_x_sqrt() +
    theme_light() +
    ggtitle('Sqrt of the Mean Nearest-Neighbour Distance')
p

In the probability-density function of the nearest neighbours the ependymal cells show the shortest nearest-neighbour distances. The OD mature cells have larger nearest neighbour distances and the bimodal distribution indicates a mix of longer and wider distances (which was as well visible in the LISA \(L\) function). Microglia cells show the widest distances and the symmetry of the curve indicates similar distances throughout the field of view.

DBScan

Often, we are interested which spatial structures build a spatial unit. One way to answer this question is to use spatially aware clustering. Here we show one very basic approach, DBScan (density-based spatial clustering with applications for noise). DBScan is an algorithm which uses a parameter called minimal Features. If the number of points per cluster is smaller than the minimal number of features, it is either noise or it is merged with another cluster. Another parameter \(\epsilon\) defines the distance at which a point has to lay in order to be part of a cluster. The value of this parameter \(\epsilon\) can be determined by an elbow plot, similar to clustering resolution parameters in other algorithms. from https://pro.arcgis.com/en/pro-app/latest/tool-reference/spatial-statistics/how-density-based-clustering-works.htm

pp_df <- as.data.frame(ppls$`OD Mature`)
#determine the correct epsilon neighbourhood
dbscan::kNNdistplot(pp_df, k=5)

Given the kNN distance plot we visually detect the “knee” of the curve to be at an distance \(\epsilon\) of \(200\). This value is needed for the computation of DBScan.

#perform DBScan
pp_dbscan <- dbscan::dbscan(pp_df[-3], eps =200, minPts = 5)
plot(pp_df[-3], col = pp_dbscan$cluster)

We see that DBScan identifies four clusters. A top cluster in red and two main side clusters in black and green. There is another cluster in blue which is at the top right corner.

another lead to follow in detected spatial clusters is scan statistics with the package .

Nearest-neighbour function \(G\) and empty-space function \(F\)

Definitions of \(F\) and \(G\) function

Under a stationary spatial point process, the empty-space distance is defined as

\[ d(u,X) = \min\{||u-x_i||: x_i \in X\} \]

The empty space function is then the cumulative distribution function of the empty-space distances defined above

\[ F(r) = \mathbb{P}\{d(u,X)\leq r\} \]

The nearest-neighbour distance is defined as

\[ d_i = \min_{j\neq i}||x_j-x_i|| \]

The nearest-neighbour distance distribution function \(G(r)\) is then defined as

\[ G(r) = \mathbb{P}\{d(x,X\backslash u \leq r |X\ has\ a\ point\ at\ u\} \]

For a homogeneous Poisson process, the nearest-neighbour distance distribution is identical to the empty-space function of the same process

\[ G_{pois} \equiv F_{pois} \]

For a general point process the \(F\) and \(G\) functions are different

Empty-space hazard

The \(F\) and \(G\) functions are, like the \(K\) function, cumulative. The same disadvantages as with the \(K\) function occur here too. Therefore, an analogue to the pair-correlation function would make sense to consider. For practical reasons this is no longer the derivative of the \(F\) function but rather a hazard rate.

\[ h(r) = \frac{f(r)}{1-F(r)} \]

For a CSR process, the hazard rate is

\[ h_{pois}(r) = 2 \pi \lambda r \]

Here we use a variance stabilising transformation as suggested by Baddeley et. al. This transformation means that if the process is completely spatial random, the hazard is equal to the intensity \(\lambda\).

\(J\)-Function

The concepts of the empty-space function \(F\) and the nearest-neighbour function \(G\) are somewhat complementary. If one decreases, the other increases. A comparison of these two functions as a measure of CSR is the Hopkins-Skellam test (implemented above).

Another approach is the \(J\) function.

\[ J(r) = \frac{1-G(r)}{1-F(r)} \]

“For a homogeneous Poisson process, \(F_{pois} \equiv G_{pois}\) such that then \(J_{pois} \equiv 1\). Values \(J(r) > 1\) are consistent with a regular pattern, and $J(r) < 1 is consistent with clustering.” (Baddeley et. al.)

### need to redefine the plotting function, because bootstrap is not available for spacing functions ###
#PRE: Celltypes of interest, function to analyse, edge correction to perform
#POST: plot of the metric
plotSpacingMetric <- function(celltype_ls, ppls, fun, x,  edgecorr){
  res_df <- metricResToDF(celltype_ls, ppls, fun)
  #plot the curve
  p <- ggplot(res_df, aes(x=res_df[[x]], y=res_df[[edgecorr]], col= type))+
    geom_line()+
    ylab(edgecorr) +
    xlab(x) +
    ggtitle(paste0(fun, '-function'))+
    geom_line(aes(x=res_df[[x]], y=theo), linetype = 'dashed')+
    theme_light()
  return(p)
}

p_G <- plotSpacingMetric(celltype_ls, ppls, 'Gest', 'r', 'rs')

p_F <- plotSpacingMetric(celltype_ls, ppls, 'Fest', 'r', 'rs')

p_J <- plotSpacingMetric(celltype_ls, ppls, 'Jest', 'r', 'rs')
p_G/p_F/p_J

Accounting for Inhomogeneity in Spacing Functions

There are inhomogeneous variants of the spacing functions explained above

p_G <- plotSpacingMetric(celltype_ls, ppls, 'Ginhom', 'r', 'bord')

p_F <- plotSpacingMetric(celltype_ls, ppls, 'Finhom', 'r', 'bord')

p_J <- plotSpacingMetric(celltype_ls, ppls, 'Jinhom', 'r', 'bord')
p_G/p_F/p_J

Again here, comparing the homogeneous versions of the functions with the inhomogeneous ones reveals, that we seem to solve one problem (inhomogeneity) by assuming correlation stationarity. As this is not given, the inhomogeneous versions don’t seem to be accurate. In fact, the homogeneous versions are more easily interpretable than the inhomogeneous alternatives.

Nearest-Neighbour Orientation

Next to the distance to the nearest-neighbours, we can estimate the orientation of the vector to the nearest-neighbour. This gives an indication of the orientation of the spacing.

p <- plotSpacingMetric(celltype_ls, ppls, 'nnorient', 'phi', 'bordm')
p

The values of \(\phi\) correspond to the orientation of the point pattern. The horizontal axis goes from \(180\) to \(0\) (left to right) and the vertical from \(90\) to \(270\) (top to bottom). We can infer that the orientation of the Ependymal nearest-neighbours is along the vertical axis, OD mature cells don’t show a clear pattern and microglial cells a horizontal orientation in their nearest neighbours with a small peak at \(\sim 180\) (orientation to the top).

The concept of spacing is not only usable in point pattern analysis but more broadly in any spatial context (e.g. spacing between shapes instead of points).

Edge Corrections

The same consideration about edge effects as for the \(K\) functions have to be made for the spacing functions. The uncorrected estimators are negatively biased as estimators for the real spacing functions. The easiest approach is to draw an artificial border and consider nearest neighbours there in. Other approaches are based on sampling. Yet another approach is based on survival analysis. The idea is that a circle of a point to grows homogeneously with increasing radius until it hits the frame of the window and “dies”. This gives survival distributions. This is similar to censored data, where the Kaplan-Meier estimator is the optimal choice.

Continuous Marks

Setup

#subset the data to only look at sample ID 0.01
sub = spe[, spe$sample_id == 0.01]
#sub = spe[, spe$sample_id == 0.01 & spe$cluster_id %in% c("Astrocyte", "Pericytes")]
(pp = .ppp(sub, marks = "cluster_id"))
Marked planar point pattern: 6111 points
marks are of storage type  'character'
window: rectangle = [1222.5635, 3012.4248] x [-3993.535, -2202.755] units
#subset the data to only look at sample ID 0.01
sub_2CT = spe[, spe$sample_id == 0.01 & spe$cluster_id %in% c("Astrocyte", "Inhibitory")]
#sub = spe[, spe$sample_id == 0.01 & spe$cluster_id %in% c("Astrocyte", "Pericytes")]
(pp_2CT = .ppp(sub_2CT, marks = "cluster_id"))
Marked planar point pattern: 2611 points
marks are of storage type  'character'
window: rectangle = [1222.5635, 3012.4248] x [-3990.104, -2204.671] units
#split the multitype point process into several single type processes
#fist, set the marks of the point process to be factors
marks(pp) = factor(marks(pp))
ppls = split(pp)
plot(ppls)

In spatstat a mark can basically take any value. It can be discrete as we have seen in the discrete mark chapter or it can take a continuos value, such as gene expression. In our example we take the gene expression of some marker genes from Fig. 6 of the original publication.

#  Genes from Fig. 6 of Moffitt et al. (2018)
gex <- as.data.frame(t(as.matrix(assay(sub))))[, c('Slc18a2', 'Esr1', 'Pgr')]
rownames(gex) <- NULL
# gene expression to marks
marks(pp) <- gex

TODO: better plotting?

plot(pp)

A pairs plot indicates spatial inhomogeneity, i.e. a spatial trend of the of the marks. This plot is only reasonable for a small number of marks however.

pairs(as.data.frame(pp), panel = panel.smooth)

The Smooth command uses cross-validation to select the smoothing bandwidth of the Gaussian kernel. This estimated kernel can be used for visual inspection of the dataset.

ppsmooth <- Smooth(pp, bw.smoothppp)
plot(ppsmooth)

From the estimated intesity plot we can see that the expression of the marker genes is clearly inhomogeneous.

Nearest-neighbour interpolations uses the nearest mark to measure the intensity at each spatial location. This is conceptually similar to taking a very small bandwith for the Gaussian kernel.

plot(nnmark(pp))

We can use the average value of neighbouring point to predict the expression of a gene at each point. We can then plot the actual marks versus the fitted values to detect anomalies. E. g. Esr1 shows a clear half moon shape in the middle of the image, where the actual values are much higher than the fitted. This gives further indication of structure in the gene expression.

mfit <- Smooth(pp, bw.smoothppp, at="points")
res <- marks(pp) - mfit

plot(setmarks(pp, res))

Summary functions

As in the discrete case the summary functions assume that the point process is stationary.

Mark correlation function

The mark correlation function measures the dependence between the marks at two points at distance \(r\). It is not a correlation in the classical sense, since it can take any nonegative value. The value of 1 indicates no correlation between the marks. The generalized mark correlation function is given by

\[ k_f(r) = \frac{\mathbb{E}[f(m(u)m(v))]}{\mathbb{E}[f(M,M')]}\] where \(f(m_1,m_2)\) is a test function with two arguments that represent the marks and returns a non-negative value. For continuous non-negative marks the choice of \(f\) is by default \(f(m_1,m_2)= m_1 m_2\). \(M, M′\) represent independent, identically distributed random points with same distribution as the mark of a randomly chosen point. This denominator is chosen such that random marks have a mark correlation of 1.

plot(markcorr(pp))

We can compare the mark correlation function to a pointwise simulation envelope in which we generate 5 simulations of random labeling.

ppEsr1 <- subset(pp, select = 'Esr1')
markcorr.Esr1 <- envelope(ppEsr1, markcorr, nsim=10)
Generating 10 simulations of CSR  ...
1, 2, 3, 4, 5, 6, 7, 8, 9, 
10.

Done.
plot(markcorr.Esr1)

The plot indicates that the mark correlation function is significantly different from the random case. The positive association of expression of the Esr1 gene declines with distance. This is consistent with clustering we saw in the residual plot above.

Mark-weigthed K function

Mark-weigthed K function is a generalization of the K-function in which the contribution of each from each pair of points is weighted by a function of their respective marks. The mark-weighted K-function is given by

\[K_f(r) = \frac 1 \lambda \frac{C_f(r)}{E[ f(M_1, M_2) ]}\] with the function \(C_f(r) = E \left[ \sum_{x \in X} f(m(u), m(x)) 1{0 < ||u - x|| \le r} \; \big| \; u \in X \right]\) which is equivalent to the unnormalized mark-weighted \(K\)-function. For every point \(u\) we sum the euclidean distance \(||u - x||\) of all other points \(x\) that are within a distance \(r\). This sum is weighted by the function \(f\) of the marks of \(u\) and \(x\). The function is standardized by the expected value of \(f(M, M′)\) where \(M, M′\) represent independent, identically distributed random points with same distribution as the mark of a randomly chosen point.

In the scenario of random labeling, the mark-weighted \(K\)-function corresponds to Ripley’s \(K\)-function.

The default function is \(f(m_1, m_2) = m_1 m_2\).

ppEsr1 <- subset(pp, select = 'Esr1')

K.Esr1L <- Kmark(ppEsr1, function(m1,m2) {m1*m2})
plot(K.Esr1L)

It is important to note that theoretical value is not very informative since it represents the \(K\)-function of a Poisson point process and the underlying point process might not be Poisson. Therefore we compare the mark-weighted with its unmarked analogue.

Here we will compare the \(L\)-functions, which are the variance stabilized \(K\)-functions.

L.Esr1L <- Kmark(ppEsr1, function(m1,m2) {m1*m2}, returnL = TRUE)
Lest.ppEsr1 <- Lest(ppEsr1, nlarge=7000)
plot(eval.fv(L.Esr1L - Lest.ppEsr1))

Like for other functions we can calculate a simulation envelope for the mark-weighted K-function to get confidence intervals.

plot(envelope(ppEsr1, fun=Kmark, returnL=TRUE, nsim=50))
Generating 50 simulations of CSR  ...
1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20,
21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40,
41, 42, 43, 44, 45, 46, 47, 48, 49, 
50.

Done.

Appendix

Session info

sessionInfo()
R version 4.3.1 (2023-06-16)
Platform: aarch64-apple-darwin20 (64-bit)
Running under: macOS Monterey 12.6.1

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRblas.0.dylib 
LAPACK: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.11.0

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

time zone: Europe/Zurich
tzcode source: internal

attached base packages:
[1] stats4    stats     graphics  grDevices utils     datasets  methods  
[8] base     

other attached packages:
 [1] stringr_1.5.0                  dixon_0.0-8                   
 [3] splancs_2.01-44                spdep_1.2-8                   
 [5] spData_2.3.0                   tmap_3.3-4                    
 [7] scater_1.28.0                  scran_1.28.2                  
 [9] scuttle_1.10.3                 SFEData_1.2.0                 
[11] SpatialFeatureExperiment_1.2.3 Voyager_1.2.7                 
[13] rgeoda_0.0.10-4                digest_0.6.33                 
[15] ncf_1.3-2                      sf_1.0-14                     
[17] reshape2_1.4.4                 patchwork_1.1.3               
[19] STexampleData_1.8.0            ExperimentHub_2.8.1           
[21] AnnotationHub_3.8.0            BiocFileCache_2.8.0           
[23] dbplyr_2.3.4                   RANN_2.6.1                    
[25] seg_0.5-7                      sp_2.1-1                      
[27] rlang_1.1.1                    ggplot2_3.4.4                 
[29] dplyr_1.1.3                    mixR_0.2.0                    
[31] spatstat_3.0-6                 spatstat.linnet_3.1-1         
[33] spatstat.model_3.2-6           rpart_4.1.19                  
[35] spatstat.explore_3.2-3         nlme_3.1-162                  
[37] spatstat.random_3.1-6          spatstat.geom_3.2-5           
[39] spatstat.data_3.0-1            SpatialExperiment_1.10.0      
[41] SingleCellExperiment_1.22.0    SummarizedExperiment_1.30.2   
[43] Biobase_2.60.0                 GenomicRanges_1.52.1          
[45] GenomeInfoDb_1.36.4            IRanges_2.34.1                
[47] S4Vectors_0.38.2               BiocGenerics_0.46.0           
[49] MatrixGenerics_1.12.3          matrixStats_1.0.0             

loaded via a namespace (and not attached):
  [1] spatstat.sparse_3.0-2         bitops_1.0-7                 
  [3] httr_1.4.7                    RColorBrewer_1.1-3           
  [5] numDeriv_2016.8-1.1           backports_1.4.1              
  [7] tools_4.3.1                   utf8_1.2.3                   
  [9] R6_2.5.1                      HDF5Array_1.28.1             
 [11] mgcv_1.8-42                   rhdf5filters_1.12.1          
 [13] withr_2.5.1                   gridExtra_2.3                
 [15] leaflet_2.2.0                 leafem_0.2.3                 
 [17] cli_3.6.1                     labeling_0.4.3               
 [19] proxy_0.4-27                  dbscan_1.1-11                
 [21] foreign_0.8-84                R.utils_2.12.2               
 [23] dichromat_2.0-0.1             scico_1.5.0                  
 [25] limma_3.56.2                  rstudioapi_0.15.0            
 [27] RSQLite_2.3.1                 generics_0.1.3               
 [29] crosstalk_1.2.0               Matrix_1.5-4.1               
 [31] ggbeeswarm_0.7.2              fansi_1.0.5                  
 [33] abind_1.4-5                   R.methodsS3_1.8.2            
 [35] terra_1.7-55                  lifecycle_1.0.3              
 [37] yaml_2.3.7                    edgeR_3.42.4                 
 [39] rhdf5_2.44.0                  tmaptools_3.1-1              
 [41] grid_4.3.1                    blob_1.2.4                   
 [43] promises_1.2.1                dqrng_0.3.1                  
 [45] crayon_1.5.2                  lattice_0.21-8               
 [47] beachmat_2.16.0               KEGGREST_1.40.1              
 [49] magick_2.8.0                  pillar_1.9.0                 
 [51] knitr_1.44                    metapod_1.7.0                
 [53] rjson_0.2.21                  boot_1.3-28.1                
 [55] codetools_0.2-19              wk_0.8.0                     
 [57] glue_1.6.2                    data.table_1.14.8            
 [59] vctrs_0.6.4                   png_0.1-8                    
 [61] gtable_0.3.4                  cachem_1.0.8                 
 [63] xfun_0.40                     S4Arrays_1.0.6               
 [65] mime_0.12                     DropletUtils_1.20.0          
 [67] pracma_2.4.2                  units_0.8-4                  
 [69] statmod_1.5.0                 bluster_1.10.0               
 [71] interactiveDisplayBase_1.38.0 ellipsis_0.3.2               
 [73] bit64_4.0.5                   filelock_1.0.2               
 [75] irlba_2.3.5.1                 vipor_0.4.5                  
 [77] KernSmooth_2.23-21            Hmisc_5.1-1                  
 [79] colorspace_2.1-0              DBI_1.1.3                    
 [81] nnet_7.3-19                   raster_3.6-26                
 [83] tidyselect_1.2.0              bit_4.0.5                    
 [85] compiler_4.3.1                curl_5.1.0                   
 [87] htmlTable_2.4.1               BiocNeighbors_1.18.0         
 [89] DelayedArray_0.26.7           checkmate_2.2.0              
 [91] scales_1.2.1                  classInt_0.4-10              
 [93] rappdirs_0.3.3                goftest_1.2-3                
 [95] fftwtools_0.9-11              spatstat.utils_3.0-3         
 [97] rmarkdown_2.25                XVector_0.40.0               
 [99] htmltools_0.5.6.1             pkgconfig_2.0.3              
[101] base64enc_0.1-3               sparseMatrixStats_1.12.2     
[103] fastmap_1.1.1                 htmlwidgets_1.6.2            
[105] shiny_1.7.5.1                 DelayedMatrixStats_1.22.6    
[107] farver_2.1.1                  jsonlite_1.8.7               
[109] BiocParallel_1.34.2           R.oo_1.25.0                  
[111] BiocSingular_1.16.0           RCurl_1.98-1.12              
[113] magrittr_2.0.3                Formula_1.2-5                
[115] GenomeInfoDbData_1.2.10       s2_1.1.4                     
[117] Rhdf5lib_1.22.1               munsell_0.5.0                
[119] Rcpp_1.0.11                   ggnewscale_0.4.9             
[121] viridis_0.6.4                 stringi_1.7.12               
[123] leafsync_0.1.0                MASS_7.3-60                  
[125] zlibbioc_1.46.0               plyr_1.8.9                   
[127] parallel_4.3.1                ggrepel_0.9.4                
[129] deldir_1.0-9                  Biostrings_2.68.1            
[131] stars_0.6-4                   splines_4.3.1                
[133] tensor_1.5                    locfit_1.5-9.8               
[135] igraph_1.5.1                  ScaledMatrix_1.8.1           
[137] BiocVersion_3.17.1            XML_3.99-0.14                
[139] evaluate_0.22                 BiocManager_1.30.22          
[141] httpuv_1.6.11                 purrr_1.0.2                  
[143] polyclip_1.10-6               rsvd_1.0.5                   
[145] lwgeom_0.2-13                 xtable_1.8-4                 
[147] fdapace_0.5.9                 e1071_1.7-13                 
[149] RSpectra_0.16-1               later_1.3.1                  
[151] viridisLite_0.4.2             class_7.3-22                 
[153] tibble_3.2.1                  memoise_2.0.1                
[155] beeswarm_0.4.0                AnnotationDbi_1.62.2         
[157] cluster_2.1.4